1,040 research outputs found

    Learning to hash for large scale image retrieval

    This thesis is concerned with improving the effectiveness of nearest neighbour search. Nearest neighbour search is the problem of finding the most similar data-points to a query in a database, and is a fundamental operation that has found wide applicability in many fields. In this thesis the focus is placed on hashing-based approximate nearest neighbour search methods that generate similar binary hashcodes for similar data-points. These hashcodes can be used as the indices into the buckets of hashtables for fast search. This work explores how the quality of search can be improved by learning task specific binary hashcodes. The generation of a binary hashcode comprises two main steps carried out sequentially: projection of the image feature vector onto the normal vectors of a set of hyperplanes partitioning the input feature space followed by a quantisation operation that uses a single threshold to binarise the resulting projections to obtain the hashcodes. The degree to which these operations preserve the relative distances between the datapoints in the input feature space has a direct influence on the effectiveness of using the resulting hashcodes for nearest neighbour search. In this thesis I argue that the retrieval effectiveness of existing hashing-based nearest neighbour search methods can be increased by learning the thresholds and hyperplanes based on the distribution of the input data. The first contribution is a model for learning multiple quantisation thresholds. I demonstrate that the best threshold positioning is projection specific and introduce a novel clustering algorithm for threshold optimisation. The second contribution extends this algorithm by learning the optimal allocation of quantisation thresholds per hyperplane. In doing so I argue that some hyperplanes are naturally more effective than others at capturing the distribution of the data and should therefore attract a greater allocation of quantisation thresholds. 
The third contribution focuses on the complementary problem of learning the hashing hyperplanes. I introduce a multi-step iterative model that, in the first step, regularises the hashcodes over a data-point adjacency graph, which encourages similar data-points to be assigned similar hashcodes. In the second step, binary classifiers are learnt to separate opposing bits with maximum margin. This algorithm is extended to learn hyperplanes that can generate similar hashcodes for similar data-points in two different feature spaces (e.g. text and images). Individually the performance of these algorithms is often superior to competitive baselines. I unify my contributions by demonstrating that learning hyperplanes and thresholds as part of the same model can yield an additive increase in retrieval effectiveness.
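
The two-step hashcode generation described in this abstract (projection onto hyperplane normal vectors, then single-threshold quantisation) can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes randomly drawn Gaussian hyperplanes and a fixed threshold of 0, whereas the thesis learns both from the data. All function names here are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random normal vector per hashcode bit.

    Assumption: random Gaussian normals stand in for the learned
    hyperplanes discussed in the abstract.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hashcode(x, hyperplanes, threshold=0.0):
    """Project x onto each normal vector, then binarise with one threshold."""
    bits = []
    for w in hyperplanes:
        projection = sum(wi * xi for wi, xi in zip(w, x))
        bits.append(1 if projection >= threshold else 0)
    return tuple(bits)


def hamming(a, b):
    """Count the bit positions at which two hashcodes differ."""
    return sum(ai != bi for ai, bi in zip(a, b))


def build_table(points, hyperplanes):
    """Use each hashcode as a bucket key; store point indices in the bucket."""
    table = {}
    for idx, p in enumerate(points):
        table.setdefault(hashcode(p, hyperplanes), []).append(idx)
    return table
```

A query is answered by hashing it with the same hyperplanes and inspecting the bucket that shares its hashcode (and, optionally, buckets within a small Hamming distance). Points whose projections fall on the same side of every threshold collide in one bucket, which is why the placement of hyperplanes and thresholds affects retrieval quality.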

    The Interrelationships of Placental Mammals and the Limits of Phylogenetic Inference

    Placental mammals comprise three principal clades: Afrotheria (e.g., elephants and tenrecs), Xenarthra (e.g., armadillos and sloths), and Boreoeutheria (all other placental mammals), the relationships among which are the subject of controversy and a touchstone for debate on the limits of phylogenetic inference. Previous analyses have found support for all three hypotheses, leading some to conclude that this phylogenetic problem might be impossible to resolve due to the compounded effects of incomplete lineage sorting (ILS) and a rapid radiation. Here we show, using a genome-scale nucleotide data set, microRNAs, and the reanalysis of the three largest previously published amino acid data sets, that the root of Placentalia lies between Atlantogenata and Boreoeutheria. Although we found evidence for ILS in early placental evolution, we are able to reject previous conclusions that the placental root is a hard polytomy that cannot be resolved. Reanalyses of previous data sets recover Atlantogenata + Boreoeutheria and show that contradictory results are a consequence of poorly fitting evolutionary models; instead, when the evolutionary process is better modeled, all data sets converge on Atlantogenata. Our Bayesian molecular clock analysis estimates that marsupials diverged from placentals 157-170 Ma, crown Placentalia diverged 86-100 Ma, and crown Atlantogenata diverged 84-97 Ma. Our results are compatible with placental diversification being driven by dispersal rather than vicariance mechanisms, postdating early phases in the protracted opening of the Atlantic Ocean.

    Acute kidney disease and renal recovery : consensus report of the Acute Disease Quality Initiative (ADQI) 16 Workgroup

    Consensus definitions have been reached for both acute kidney injury (AKI) and chronic kidney disease (CKD) and these definitions are now routinely used in research and clinical practice. The KDIGO guideline defines AKI as an abrupt decrease in kidney function occurring over 7 days or less, whereas CKD is defined by the persistence of kidney disease for a period of > 90 days. AKI and CKD are increasingly recognized as related entities and in some instances probably represent a continuum of the disease process. For patients in whom pathophysiologic processes are ongoing, the term acute kidney disease (AKD) has been proposed to define the course of disease after AKI; however, definitions of AKD and strategies for the management of patients with AKD are not currently available. In this consensus statement, the Acute Disease Quality Initiative (ADQI) proposes definitions, staging criteria for AKD, and strategies for the management of affected patients. We also make recommendations for areas of future research, which aim to improve understanding of the underlying processes and improve outcomes for patients with AKD

    Phylogenomics of the Reproductive Parasite Wolbachia pipientis wMel: A Streamlined Genome Overrun by Mobile Genetic Elements

    The complete sequence of the 1,267,782 bp genome of Wolbachia pipientis wMel, an obligate intracellular bacterium of Drosophila melanogaster, has been determined. Wolbachia, which are found in a variety of invertebrate species, are of great interest due to their diverse interactions with different hosts, which range from many forms of reproductive parasitism to mutualistic symbioses. Analysis of the wMel genome, in particular phylogenomic comparisons with other intracellular bacteria, has revealed many insights into the biology and evolution of wMel and Wolbachia in general. For example, the wMel genome is unique among sequenced obligate intracellular species in both being highly streamlined and containing very high levels of repetitive DNA and mobile DNA elements. This observation, coupled with multiple evolutionary reconstructions, suggests that natural selection is somewhat inefficient in wMel, most likely owing to the occurrence of repeated population bottlenecks. Genome analysis predicts many metabolic differences with the closely related Rickettsia species, including the presence of intact glycolysis and purine synthesis, which may compensate for an inability to obtain ATP directly from its host, as Rickettsia can. Other discoveries include the apparent inability of wMel to synthesize lipopolysaccharide and the presence of more genes encoding proteins with ankyrin repeat domains than in any other prokaryotic genome yet sequenced. Despite the ability of wMel to infect the germline of its host, we find no evidence for either recent lateral gene transfer between wMel and D. melanogaster or older transfers between Wolbachia and any host. Evolutionary analysis further supports the hypothesis that mitochondria share a common ancestor with the α-Proteobacteria, but shows little support for the grouping of mitochondria with species in the order Rickettsiales.
With the availability of the complete genomes of both species and excellent genetic tools for the host, the wMel–D. melanogaster symbiosis is now an ideal system for studying the biology and evolution of Wolbachia infections.

    Photochemically-produced SO2 in the atmosphere of WASP-39b

    Photochemistry is a fundamental process of planetary atmospheres that regulates the atmospheric composition and stability. However, no unambiguous photochemical products have been detected in exoplanet atmospheres to date. Recent observations from the JWST Transiting Exoplanet Early Release Science Program found a spectral absorption feature at 4.05 μm arising from SO2 in the atmosphere of WASP-39b. WASP-39b is a 1.27-Jupiter-radii, Saturn-mass (0.28 MJ) gas giant exoplanet orbiting a Sun-like star with an equilibrium temperature of ~1100 K. The most plausible way of generating SO2 in such an atmosphere is through photochemical processes. Here we show that the SO2 distribution computed by a suite of photochemical models robustly explains the 4.05 μm spectral feature identified by JWST transmission observations with NIRSpec PRISM (2.7σ) and G395H (4.5σ). SO2 is produced by successive oxidation of sulphur radicals freed when hydrogen sulphide (H2S) is destroyed. The sensitivity of the SO2 feature to the enrichment of the atmosphere by heavy elements (metallicity) suggests that it can be used as a tracer of atmospheric properties, with WASP-39b exhibiting an inferred metallicity of ~10× solar. We further point out that SO2 also shows observable features at ultraviolet and thermal infrared wavelengths not available from the existing observations. Comment: 39 pages, 14 figures, accepted to be published in Natur

    Photochemically produced SO2 in the atmosphere of WASP-39b

    Photochemistry is a fundamental process of planetary atmospheres that regulates the atmospheric composition and stability [1]. However, no unambiguous photochemical products have been detected in exoplanet atmospheres so far. Recent observations from the JWST Transiting Exoplanet Community Early Release Science Program [2,3] found a spectral absorption feature at 4.05 μm arising from sulfur dioxide (SO2) in the atmosphere of WASP-39b. WASP-39b is a 1.27-Jupiter-radii, Saturn-mass (0.28 MJ) gas giant exoplanet orbiting a Sun-like star with an equilibrium temperature of around 1,100 K [4]. The most plausible way of generating SO2 in such an atmosphere is through photochemical processes [5,6]. Here we show that the SO2 distribution computed by a suite of photochemical models robustly explains the 4.05-μm spectral feature identified by JWST transmission observations [7] with NIRSpec PRISM (2.7σ) [8] and G395H (4.5σ) [9]. SO2 is produced by successive oxidation of sulfur radicals freed when hydrogen sulfide (H2S) is destroyed. The sensitivity of the SO2 feature to the enrichment of the atmosphere by heavy elements (metallicity) suggests that it can be used as a tracer of atmospheric properties, with WASP-39b exhibiting an inferred metallicity of about 10× solar. We further point out that SO2 also shows observable features at ultraviolet and thermal infrared wavelengths not available from the existing observations.

    Targeting DNA Damage Response and Replication Stress in Pancreatic Cancer

    Background and aims: Continuing recalcitrance to therapy cements pancreatic cancer (PC) as the most lethal malignancy, which is set to become the second leading cause of cancer death in our society. The study aim was to investigate the association between DNA damage response (DDR), replication stress and novel therapeutic response in PC to develop a biomarker-driven therapeutic strategy targeting DDR and replication stress in PC. Methods: We interrogated the transcriptome, genome, proteome and functional characteristics of 61 novel PC patient-derived cell lines to define novel therapeutic strategies targeting DDR and replication stress. Validation was done in patient-derived xenografts and human PC organoids. Results: Patient-derived cell lines faithfully recapitulate the epithelial component of pancreatic tumors including previously described molecular subtypes. Biomarkers of DDR deficiency, including a novel signature of homologous recombination deficiency, co-segregate with response to platinum (P < 0.001) and PARP inhibitor therapy (P < 0.001) in vitro and in vivo. We generated a novel signature of replication stress that predicts response to ATR (P < 0.018) and WEE1 inhibitor (P < 0.029) treatment in both cell lines and human PC organoids. Replication stress was enriched in the squamous subtype of PC (P < 0.001) but not associated with DDR deficiency. Conclusions: Replication stress and DDR deficiency are independent of each other, creating opportunities for therapy in DDR-proficient PC, and post-platinum therapy.

    A Universal Power-law Prescription for Variability from Synthetic Images of Black Hole Accretion Flows

    We present a framework for characterizing the spatiotemporal power spectrum of the variability expected from the horizon-scale emission structure around supermassive black holes, and we apply this framework to a library of general relativistic magnetohydrodynamic (GRMHD) simulations and associated general relativistic ray-traced images relevant for Event Horizon Telescope (EHT) observations of Sgr A*. We find that the variability power spectrum is generically a red-noise process in both the temporal and spatial dimensions, with the peak in power occurring on the longest timescales and largest spatial scales. When both the time-averaged source structure and the spatially integrated light-curve variability are removed, the residual power spectrum exhibits a universal broken power-law behavior. On small spatial frequencies, the residual power spectrum rises as the square of the spatial frequency and is proportional to the variance in the centroid of emission. Beyond some peak in variability power, the residual power spectrum falls as that of the time-averaged source structure, which is similar across simulations; this behavior can be naturally explained if the variability arises from a multiplicative random field that has a steeper high-frequency power-law index than that of the time-averaged source structure. We briefly explore the ability of power spectral variability studies to constrain physical parameters relevant for the GRMHD simulations, which can be scaled to provide predictions for black holes in a range of systems in the optically thin regime. We present specific expectations for the behavior of the M87* and Sgr A* accretion flows as observed by the EHT

    GA4GH: International policies and standards for data sharing across genomic research and healthcare.

    The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits